Dosimetric comparison of deformable image registration and synthetic CT generation based on CBCT images for organs at risk in cervical cancer radiotherapy

Objective Anatomical variations during cervical cancer radiotherapy can be monitored with cone-beam computed tomography (CBCT) images. Deformable image registration (DIR) from planning CT (pCT) to CBCT images and synthetic CT (sCT) image generation based on CBCT are two methods for improving the quality of CBCT images. This study aims to compare the accuracy of these two approaches geometrically and dosimetrically in cervical cancer radiotherapy.
Methods In this study, 40 paired pCT-CBCT images were collected to evaluate the accuracy of DIR and sCT generation. The DIR method was based on a 3D multistage registration network trained with 150 paired pCT-CBCT images, and the sCT generation method was based on a 2D cycle-consistent adversarial network (CycleGAN) trained with 6000 paired pCT-CBCT slices. The doses were then recalculated on the CBCT, pCT, deformed pCT (dpCT) and sCT images with a GPU-based Monte Carlo dose code, ArcherQA, to obtain Dose CBCT, Dose pCT, Dose dpCT and Dose sCT. Organs at risk (OARs) included the small intestine, rectum, bladder, spinal cord, femoral heads and bone marrow. The CBCT and pCT contours were delineated manually, the dpCT contours were propagated through deformation vector fields, and the sCT contours were auto-segmented and corrected manually.
Results At the 1%/1 mm criterion with a low-dose threshold of 10%, the global gamma pass rate between Dose sCT and Dose dpCT was 99.66% ± 0.34%, while that between Dose CBCT and Dose dpCT was 85.92% ± 7.56%. With Dose dpCT as the uniform dose distribution, the dpCT and sCT contours showed comparable errors in the femoral heads and bone marrow relative to the CBCT contours, while the sCT contours had lower errors in the small intestine, rectum, bladder and spinal cord, especially in cases with a large volume difference between pCT and CBCT.
Conclusions For cervical cancer radiotherapy, the DIR method and sCT generation could produce similarly precise dose distributions, but the sCT contours had higher accuracy when the difference between the planning CT and CBCT was large.


Introduction
Cone-beam computed tomography (CBCT) images are widely used in image-guided radiation therapy (IGRT) systems, as they can monitor anatomical variations during clinical treatment. Since anatomical changes during radiotherapy can alter the planned dose distribution [1,2], dose recalculations based on CBCT images are necessary to obtain the fractional dose distribution for patients. Unfortunately, radiotherapy dose calculations based on CBCT images can be inaccurate due to scattering and other artifacts [3-5], so clinically, CBCT images are mostly used during pretreatment imaging for setup verification [6,7]. Researchers are constantly striving to develop methods that can improve CBCT image quality, such as Varian's iterative cone-beam CT (iCBCT), which combines statistical reconstruction and the Acuros CTS scattering correction algorithm [8,9] to achieve uniform imaging with less noise and higher quality. Jarema et al. [10] validated the use of iCBCT for dose calculation in pelvis radiotherapy. Nevertheless, artifacts (cavity artifacts, etc.) still exist in iCBCT images, reducing the accuracy of structure recognition and dose calculation. Further CBCT applications in radiotherapy, such as adaptive radiotherapy, are similarly limited.
Various scholars have proposed methods to minimize dose calculation inaccuracies, which can be roughly classified into two types: projection domain methods and image domain methods. Projection domain methods suppress scatter during projection data acquisition, improving CBCT image quality by optimizing the projection data [11,12]. Image domain methods improve image quality with image processing algorithms [13-18] and can be further divided into 4 main categories: (1) calibration curve plotting between the HU and density (HU-D curve), the method most commonly used in the clinic, which converts CBCT HUs to densities for dose calculation [19]; (2) the density assignment method (DAM), which segments an image into several tissue classes and assigns a suitable density to each class [16]; (3) deformable image registration (DIR) from CT to CBCT, in which the deformed CT images approximately represent the anatomical structures of the CBCT images, and the HU values are sufficiently accurate for dose calculation [20,21]; and (4) synthetic CT (sCT) generation, which provides HU values similar to those of CT and anatomical structures similar to those of CBCT [22-25]. However, the HU-D curve can be sensitive to artifacts and scattering, and DAM cannot reflect subtle structures. DIR and sCT generation are superior to the other two methods. Barateau et al. [26] compared the 4 methods for H&N radiotherapy and demonstrated that the DIR method and sCT generation appeared to be the most appealing CBCT-based dose calculation methods.
In our previous study [27], the accuracy of the image quality and the structural consistency with CBCT images were compared between the DIR method and sCT generation method. The results demonstrated that both DIR and sCT generation could effectively improve the image quality of CBCT images. Due to the anatomic differences between the planning CT (pCT) and CBCT, especially in bladder volume, the accuracy of DIR appeared unsatisfactory, and the sCT generated by CycleGAN showed better structural consistency with CBCT. On the other hand, the sCT was obtained from trained model parameters, which might produce some artificial structures inconsistent with the CBCT images. The inconsistent anatomy of deformed pCT (dpCT) and artificial structures of sCT may introduce dose calculation errors, which require further validation. In this paper, we implemented dose recalculations on CBCT, pCT, dpCT and sCT images and compared the accuracy of the sCT dose and dpCT dose dosimetrically. Furthermore, the dosimetric accuracy of dpCT contours and sCT contours was analyzed compared with CBCT contours.

Data acquisition
A total of 190 paired CT and CBCT images from 115 cervical cancer patients were collected for retrospective analysis in this study. All patients were treated with two-course intensity-modulated radiotherapy (IMRT): 20 fractions for the first course and 8 fractions for the second course. Of these 115 patients, 17 were in stage I, 43 in stage II, 52 in stage III and 3 in stage IV (FIGO 2018). The total prescription dose was 50.4 Gy at 1.8 Gy/fraction. Patients who had finished the treatment had two planning CTs, whereas patients who had not yet finished the treatment when the data were collected had only one. The paired CBCT was collected in the first fraction, as shown in Fig. 1. A total of 150 paired CT-CBCT images were used for training the DIR network and sCT generation network, and the remaining 40 paired images were used for the dosimetric comparison.
The CT images were obtained on a Philips Brilliance™ Big Bore CT scanner, which has a bore diameter of 85 cm. The in-plane resolution of the CT ranged from 0.962 mm × 0.962 mm to 1.365 mm × 1.365 mm, and the slice thickness was 5 mm. The CBCT images were obtained from a Halcyon 2.0 system (Varian, USA), with in-plane resolution ranging from 0.908 mm × 0.908 mm to 1.035 mm × 1.035 mm and a slice thickness of 2 mm. The CBCT scan was mainly concentrated near the clinical tumor target area, with a length of approximately 240 mm. The scanning range of the CT scan was longer than that of the CBCT scan and completely covered it.

Deformable image registration and sCT generation
The methods underlying the development of the DIR network and sCT generation network are described in detail in our previous work [27]. Briefly, the DIR network is based on a 3D multistage registration network (MSnet), which includes three stages of registration, each of which consists of two downsampling layers, six ResNet blocks [28] and two upsampling layers. The MSnet model was trained and tested on an Nvidia GeForce RTX 3090. The batch size was set to 20 in stage 1, 4 in stage 2 and 1 in stage 3. The training required approximately 24 hours for 200 epochs. The sCT generation network is based on the 2D cycle-consistent generative adversarial network (CycleGAN), which mainly consists of two generators (G CBCT-CT, G CT-CBCT) and two discriminators (D CT, D CBCT): G CBCT-CT generates the sCT images from the CBCT images, G CT-CBCT generates the synthetic CBCT (sCBCT) images from the CT images, D CT distinguishes the sCT images from the real CT images, and D CBCT distinguishes the sCBCT images from the real CBCT images. A ResNet with 15 ResNet blocks is used as the generator. After training, sCT images are generated from CBCT images by G CBCT-CT. The CycleGAN model was trained and tested on an Nvidia GeForce RTX 3090. Adam was selected as the model optimizer. The batch size was set to 6, the initial learning rate was set to 0.002 and the GAN discrimination rate was set to 0.02. The number of epochs was set to 200, and the learning rate decreased linearly from 0.002 to 0 in the last 100 epochs.
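As an illustration of the CycleGAN learning-rate schedule described above (constant at 0.002, then a linear decrease to 0 over the last 100 of 200 epochs), the following minimal Python sketch may be helpful. The function name `linear_decay_lr` and the exact per-epoch convention (0-indexed epochs, with the rate reaching 0 only after the final epoch) are assumptions for illustration and are not taken from the original training code.

```python
def linear_decay_lr(epoch, base_lr=0.002, total_epochs=200, decay_epochs=100):
    """Return the learning rate for a 0-indexed epoch.

    The rate stays at base_lr for the first (total_epochs - decay_epochs)
    epochs and then falls linearly, reaching 0 at epoch == total_epochs.
    The per-epoch convention is an assumption, not the authors' code.
    """
    constant_epochs = total_epochs - decay_epochs
    if epoch < constant_epochs:
        return base_lr
    # Epochs remaining until the end of training, clamped at zero.
    remaining = max(total_epochs - epoch, 0)
    return base_lr * remaining / decay_epochs


if __name__ == "__main__":
    for e in (0, 99, 100, 150, 199):
        print(e, linear_decay_lr(e))
```

In a deep learning framework, a function of this form would typically be passed to a scheduler as a multiplicative factor rather than called by hand.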

Image processing
As mentioned above, it was difficult to expand the scanning range of CBCT to cover the entire treatment area. The missing body data could affect the accuracy of dose calculation if the CBCT images were used directly, so the area not scanned by CBCT needed to be filled. As shown in Fig. 2, rigid alignment based on the Insight Toolkit (ITK) [29,30] was first implemented from the CBCT images to the pCT images to obtain rigidly aligned CBCT images (rCBCT). In the process of rigid alignment, the coordinates and resolution of the CBCT images were made consistent with those of the pCT images. Because the information stored in the RP and RD files was based on the pCT images, the rigidly aligned CBCT images could better serve the dose calculation. Second, the dpCT images were obtained by the trained MSnet, and the sCT images were obtained by the trained generator G CBCT-CT. Finally, the area of overlap between the CBCT and pCT images was identified and marked. Outside the overlapping area, the pCT images were copied directly without any change. Within the overlapping area, the CBCT, dpCT or sCT images were used. It is worth mentioning that the field of view (FOV) gradually decreased at both ends of the CBCT image, so the slices at both ends did not contain complete images of the layer. In practice, the middle 200 mm was therefore taken as the overlapping area even though the full length of the CBCT images was 240 mm.
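The filling step described above can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions: both volumes are taken to be already resampled onto the pCT grid with axis order (z, y, x), the overlapping slab is taken to be centered on a given slice index, and the names `stitch_volume`, `inner` and `center_index` are hypothetical rather than part of the authors' pipeline.

```python
import numpy as np


def stitch_volume(pct, inner, z_spacing_mm, overlap_length_mm=200.0,
                  center_index=None):
    """Fill a volume with pCT outside the overlap and `inner` inside it.

    pct, inner : arrays of identical shape (z, y, x) on the pCT grid;
                 `inner` stands for the rCBCT, dpCT or sCT volume.
    Returns the stitched volume and the (lo, hi) slice range that was
    replaced. Centering the slab on `center_index` is an assumption.
    """
    if pct.shape != inner.shape:
        raise ValueError("volumes must share the pCT grid")
    n_slices = pct.shape[0]
    if center_index is None:
        center_index = n_slices // 2
    # Half-width of the overlapping slab, in slices.
    half = int(round(overlap_length_mm / z_spacing_mm / 2))
    lo = max(center_index - half, 0)
    hi = min(center_index + half, n_slices)
    out = pct.copy()               # outside the overlap: pCT unchanged
    out[lo:hi] = inner[lo:hi]      # inside the overlap: CBCT, dpCT or sCT
    return out, (lo, hi)


if __name__ == "__main__":
    pct = np.zeros((60, 2, 2))
    sct = np.ones((60, 2, 2))
    stitched, (lo, hi) = stitch_volume(pct, sct, z_spacing_mm=5.0,
                                       center_index=30)
    print(lo, hi, stitched.sum())
```

With the 5 mm pCT slice thickness reported in this study, a 200 mm slab corresponds to 40 slices.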

Delineation and dose calculation
Structure delineation of the pCT images (pCT contours) was completed by experienced senior physicians and used for clinical radiotherapy. The dpCT contours were propagated from the pCT images to the corresponding CBCT images through deformation vector fields, and the sCT contours were auto-segmented and corrected manually. For the test cases, the physician redelineated the structures on the CBCT images (CBCT contours). The pCT, CBCT, dpCT and sCT images of the 40 test cases were obtained through the data processing shown in Fig. 2; their image parameters (image position patient, image spacing, etc.) were consistent with those of the pCT images, and only the HU values differed. Using the RP and RD files optimized based on the pCT images, the dose recalculations were implemented by a GPU-accelerated Monte Carlo code, ArcherQA, previously developed by our group [31,32]. ArcherQA uses GPU acceleration to calculate the dose distribution based on CT images quickly and accurately. This allowed the pCT, CBCT, dpCT and sCT doses (Dose pCT, Dose CBCT, Dose dpCT, Dose sCT) to be obtained quickly.
Additionally, the dose discrepancies of organs at risk were analyzed. Analyzing the dose metrics requires both the dose distribution and the contour information. Each of the test cases had 4 sets of dose distributions (pCT, CBCT, dpCT, sCT) and 4 sets of corresponding contours. If each dose distribution were analyzed with its own contours, the results would not be comparable. Therefore, we designed two groups of experiments: (1) using the CBCT contours as the uniform contour information, the dose metric errors of Dose pCT, Dose CBCT and Dose sCT were calculated and evaluated compared with Dose dpCT; (2) using Dose dpCT as the uniform dose distribution, the dose metric errors of the pCT, dpCT and sCT contours were calculated and evaluated compared with the CBCT contours.
In this study, the organs for comparison included the bladder, spinal cord, left femoral head, right femoral head and bone marrow, and the dose metrics included Dmean (the mean dose inside the organ) and D2 (the dose received by 2% of the volume). Dose metric errors were calculated using the following formula:

$$\text{Error} = \frac{\left| M_{\text{comparison}} - M_{\text{ground truth}} \right|}{M_{\text{ground truth}}} \times 100\%$$

where M represents the dose metric (Dmean or D2), M ground truth represents the ground truth value of the dose metric, and M comparison represents the value of the dose metric to be compared. The dose metric error is expressed as a percentage; the smaller the value is, the closer the metric is to the ground truth. Paired sample t tests were used to evaluate the statistical significance of all the dose-volume parameters.

Results
Table 1 showed the gamma pass rates of Dose CBCT and Dose sCT using Dose dpCT as the ground truth. The results in Table 1 indicated that Dose sCT was highly consistent with Dose dpCT. Even at the criterion of 1%/1 mm, the gamma pass rate of Dose sCT was 99.66% ± 0.34%, whereas that of Dose CBCT was 85.92% ± 7.56%. Figure 3 showed the visualized image differences and dose comparisons for dpCT, CBCT and sCT. sCT had higher image quality and smaller dose calculation errors than CBCT.
Table 1 Comparison of gamma passing rates between Dose CBCT and Dose sCT (threshold = 10%, global 3D). *P was calculated by comparing the gamma passing rates of Dose CBCT vs. Dose dpCT and Dose sCT vs. Dose dpCT according to paired sample t tests.
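To make the dose metrics defined in the methods concrete, the sketch below computes Dmean, D2 and the percentage dose metric error from a dose grid and a binary organ mask. Several points are assumptions for illustration only: all voxels are taken to have equal volume, D2 is taken as the 98th percentile of the voxel doses inside the organ, and the error is taken as an absolute relative difference in percent, which is consistent with the statement that smaller values are closer to the ground truth. The function names are hypothetical and do not come from ArcherQA.

```python
import numpy as np


def d_mean(dose, mask):
    """Mean dose inside the organ (mask is a boolean array)."""
    return float(dose[mask].mean())


def d2(dose, mask):
    """Dose received by the hottest 2% of the organ volume.

    Assumes equal voxel volumes, so D2 is the 98th percentile of the
    voxel doses inside the mask.
    """
    return float(np.percentile(dose[mask], 98))


def metric_error_percent(m_comparison, m_ground_truth):
    """Absolute relative error of a dose metric, in percent (assumed form)."""
    return abs(m_comparison - m_ground_truth) / abs(m_ground_truth) * 100.0


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dose_ref = rng.uniform(10.0, 50.0, size=(8, 8, 8))   # stand-in for Dose dpCT
    dose_cmp = dose_ref * 1.01                            # stand-in for Dose sCT
    organ = np.zeros((8, 8, 8), dtype=bool)
    organ[2:6, 2:6, 2:6] = True
    print(metric_error_percent(d_mean(dose_cmp, organ), d_mean(dose_ref, organ)))
    print(metric_error_percent(d2(dose_cmp, organ), d2(dose_ref, organ)))
```

For the first experiment group the mask is fixed (CBCT contours) and the dose grid varies; for the second group the dose grid is fixed (Dose dpCT) and the mask varies.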

Accuracy of contours
The results in Table 2 showed the Dice similarity coefficient (DSC) of the pCT contours, dpCT contours and sCT contours using the CBCT contours as the ground truth. The accuracy of the sCT contours was higher than that of the pCT and dpCT contours, especially for the deformable OARs (small intestine, rectum and bladder).
Table 3 showed the dose metric errors for different dose distributions using the CBCT contours as the uniform contour information. The results showed that Dose sCT had the smallest errors in Dmean and D2. The average dose metric errors of Dose CBCT were in the range of 0.756-3.491%, the average dose metric errors of Dose pCT were less than 0.8%, and the average dose metric errors of Dose sCT were less than 0.3%. The results demonstrated that the error introduced by CBCT-based dose calculation was greater than that caused by anatomical differences (Dose pCT vs. Dose dpCT). Figure 4 showed the dose-volume histograms with different dose distributions for the 9th paired data set. The sCT-based curve and the dpCT-based curve almost completely overlapped, the pCT-based curve deviated slightly from the dpCT-based curve, and the CBCT-based curve had the largest deviation from the dpCT-based curve.
Table 4 showed the dose metric errors for different contours. Considering the anatomical structures and image quality, Dose dpCT was used as the uniform dose distribution to analyze the dose errors caused by different contours. As shown in Table 4, the errors of both Dmean and D2 were smaller for rigid organs (femoral heads and bone marrow) and larger for deformable organs (bladder). We conducted further analysis for different volume changes using the difference in organ volume between pCT and CBCT as the analysis indicator. The 40 cases in the test set were divided into two groups, one with small Diff(V CBCT, V pCT) and the other with large Diff(V CBCT, V pCT). Table 4 shows that a larger Diff(V CBCT, V pCT) resulted in larger errors, especially for the pCT contours and dpCT contours. For the 20 cases with smaller Diff(V CBCT, V pCT), the sCT contours had lower errors, but none of the differences were significant (except for D2 of the spinal cord). For the 20 cases with larger Diff(V CBCT, V pCT), the sCT contours had comparable or lower errors, and significant differences were observed in the bladder. Figure 5 shows the dose-volume histograms with different contours for the 9th paired data set.
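The contour comparison in this section relies on the Dice similarity coefficient and on splitting the test cases by volume change. The following Python sketch shows one way both could be computed from binary masks. It is illustrative only: the convention of returning 1.0 for two empty masks, the equal-halves split by sorted volume difference, and the names `dice` and `split_by_volume_change` are assumptions, and the definition of Diff(V CBCT, V pCT) itself is left to the caller because it is not spelled out in the text.

```python
import numpy as np


def dice(mask_a, mask_b):
    """Dice similarity coefficient, 2|A ∩ B| / (|A| + |B|), of two binary masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0  # both masks empty: treated as perfect agreement (assumption)
    return float(2.0 * np.logical_and(a, b).sum() / denom)


def split_by_volume_change(volume_diffs):
    """Split case indices into equal 'small' and 'large' halves.

    volume_diffs : one precomputed Diff(V CBCT, V pCT) value per case.
    Returns (small_indices, large_indices), each sorted by increasing diff.
    """
    order = np.argsort(np.asarray(volume_diffs, dtype=float))
    half = len(order) // 2
    return order[:half].tolist(), order[half:].tolist()


if __name__ == "__main__":
    cbct_contour = np.array([1, 1, 0, 0], dtype=bool)
    sct_contour = np.array([1, 0, 0, 0], dtype=bool)
    print(dice(cbct_contour, sct_contour))
    print(split_by_volume_change([5.0, 1.0, 9.0, 3.0]))
```

Applied to the 40 test cases, the split would give the two groups of 20 cases analyzed in Table 4.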

Discussion
The results of this study suggested that both the DIR and sCT generation methods could improve the accuracy of dose calculation based on CBCT images for cervical cancer patients. The iCBCT images in the Varian Halcyon 2.0 system are generated by combining statistical reconstruction and the Acuros CTS scattering correction algorithm, greatly improving the dose calculation accuracy. The average gamma pass rate at the 2%/2 mm criterion reached more than 97%. Nevertheless, Figure 3 shows that errors persisted between the CBCT-based and dpCT-based dose calculations, with many body voxel dose differences greater than 1 Gy. In contrast, the sCT-based and dpCT-based dose calculations had higher consistency, and the gamma pass rate at the 1%/1 mm criterion was more than 99%. The consistency between Dose sCT and Dose dpCT illustrated not only the accuracy of sCT for dose calculation but also the accuracy of dose calculation based on dpCT, which was the major reason for using Dose dpCT as the realistic fractional dose distribution. Moazzezi et al. [33] also described that the scheduled dose was calculated on a simulated CT, which was produced by deformably registering the planning CT to the daily CBCT in Ethos [34]. We considered a few reasons for the high consistency between the dpCT-based dose distribution and sCT-based dose distribution. The dpCT images were deformed from the pCT images to the CBCT images, and the sCT images were generated based on CBCT images; both therefore had anatomical structures similar to those of the CBCT images, with especially high geometric similarity in the skin and bony structures.
Table 4 Dose metric errors for different contours using dpCT-based dose distribution (%). *P was calculated by comparing the dose metric errors of the dpCT contours vs. CBCT contours and of the sCT contours vs. CBCT contours according to paired sample t tests.
As we found in our previous work [27], the DIR method appeared to be ineffective for large deformable structures composed of soft tissue, and sCT might produce some artificial structures inconsistent with the CBCT images, which caused the slight difference in the anatomical structures of the dpCT images and sCT images. In Fig. 3, the sCT-based dose distribution was highly consistent with the dpCT-based dose distribution, which illustrated that the slight difference in the anatomical structures of the dpCT images and sCT images had little effect on the dose calculations. This result is in agreement with the conclusion of Liu et al. [35]. We believe the reason is that large deformable structures (bladder, etc.) are surrounded by other structures with similar HUs. Even where the DIR method could not achieve high accuracy, the HUs were converted into densities according to the electron density curve, and the small density difference resulted in a small dose difference. For sCT generation, most errors were in the vicinity of organs with similar HUs, which might result in more identification errors in anatomical structures but fewer errors in dose calculation. Table 3 shows the dose metric errors for different dose distributions using uniform contours (CBCT contours). The results were roughly similar to those described in the above discussion: the sCT-based dose distribution had smaller errors, less than 0.3%, than the CBCT-based dose distribution. Table 4 shows that the sCT contours had smaller errors in the bladder and spinal cord, and there were comparable errors in the left femoral head, right femoral head and bone marrow for the pCT, dpCT and sCT contours. When the volume difference between the pCT and CBCT images was large, the sCT contours had higher accuracy in the deformable organs (small intestine, rectum and bladder).
The combined analysis of Tables 3 and 4 showed that the errors caused by the different contours (Table 4) were larger than the errors caused by the different dose distributions (Table 3), which could be clearly seen from the comparison between Figs. 4 and 5. Our previous study [27] demonstrated that the DIR and sCT generation methods both improved the iCBCT image quality effectively, and sCT achieved higher accuracy when the difference between the planning CT and iCBCT was large. In this study, sCT produced a dose distribution as precise as that of dpCT, and the sCT contours had lower errors, demonstrating its clinical superiority over dpCT.
Some limitations of this study should be noted. First, the dosimetric analysis was carried out only for 7 organs at risk, and not for the target, because it was difficult to accurately identify the target on CBCT images. Second, the sCT contours could not be obtained with high accuracy automatically. Autosegmentation based on sCT images will be completed in our next work.